Learning the dependence structure of rare events: a non-asymptotic study
Assessing the probability of occurrence of extreme events is a crucial issue
in various fields like finance, insurance, telecommunication or environmental
sciences. In a multivariate framework, the tail dependence is characterized by
the so-called stable tail dependence function (STDF). Learning this structure
is the keystone of multivariate extremes. Although extensive studies have
proved consistency and asymptotic normality for the empirical version of the
STDF, non-asymptotic bounds are still missing. The main purpose of this paper
is to fill this gap. Taking advantage of adapted VC-type concentration
inequalities, upper bounds are derived with expected rate of convergence in
O(k^{-1/2}). The concentration tools involved in this analysis rely on a more
general study of maximal deviations in low probability regions, and thus
directly apply to the classification of extreme data.
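To make the estimated object concrete, the empirical STDF can be computed from ranks alone. The sketch below is a minimal illustration, assuming the common rank-based form l_n(x) = (1/k) * #{i : R_i^j > n - floor(k * x_j) for at least one j}, where R_i^j is the rank of observation i in coordinate j. The names `ranks` and `empirical_stdf` are hypothetical and not taken from the paper.

```python
def ranks(values):
    """Return ranks 1..n (1 = smallest); ties are broken by order of appearance."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        r[i] = rank
    return r


def empirical_stdf(sample, x, k):
    """Rank-based estimate of the STDF at x from an n x d sample.

    Assumed definition: count the observations that fall among the
    floor(k * x_j) largest values in at least one coordinate j, then divide by k.
    """
    n = len(sample)
    d = len(x)
    col_ranks = [ranks([row[j] for row in sample]) for j in range(d)]
    count = 0
    for i in range(n):
        if any(col_ranks[j][i] > n - int(k * x[j]) for j in range(d)):
            count += 1
    return count / k


# Two deterministic sanity checks with n = 100 and k = 10.
comonotone = [(i, i) for i in range(100)]        # extremes always occur together
countermonotone = [(i, -i) for i in range(100)]  # extremes never occur together
print(empirical_stdf(comonotone, (1.0, 1.0), 10))       # 1.0
print(empirical_stdf(countermonotone, (1.0, 1.0), 10))  # 2.0
```

The two values match the known bounds max(x_1, x_2) <= l(x) <= x_1 + x_2 at x = (1, 1): full tail dependence gives 1 and tail independence gives 2.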
Identifying groups of variables with the potential of being large simultaneously
Identifying groups of variables that may be large simultaneously amounts to
finding out which joint tail dependence coefficients of a multivariate
distribution are positive. The asymptotic distribution of a vector of
nonparametric, rank-based estimators of these coefficients justifies a stopping
criterion in an algorithm that searches the collection of all possible groups
of variables in a systematic way, from smaller groups to larger ones. The issue
that the tolerance level in the stopping criterion should depend on the size of
the groups is circumvented by the use of a conditional tail dependence
coefficient. Alternatively, such stopping criteria can be based on limit
distributions of rank-based estimators of the coefficient of tail dependence,
quantifying the speed of decay of joint survival functions. Numerical
experiments indicate that the algorithm's effectiveness for detecting
tail-dependent groups of variables is highest when paired with a criterion
based on a Hill-type estimator of the coefficient of tail dependence.
Comment: 23 pages, 2 tables
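A Hill-type estimator of the coefficient of tail dependence for a group of variables can be sketched as follows. This is one common construction and not necessarily the exact estimator used in the paper: margins are standardized to approximately unit Pareto through ranks, the componentwise minimum is taken over the group, and the Hill estimator is applied to its k largest values. The function names and the choice of k are assumptions made for illustration.

```python
import math
import random


def ranks(values):
    """Return ranks 1..n (1 = smallest); ties are broken by order of appearance."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        r[i] = rank
    return r


def hill_eta(sample, k):
    """Hill-type estimate of the coefficient of tail dependence.

    Each coordinate is mapped to (n + 1) / (n + 1 - rank), which is roughly
    unit Pareto. The minimum over coordinates is large only when all
    variables in the group are large simultaneously.
    """
    n = len(sample)
    d = len(sample[0])
    col_ranks = [ranks([row[j] for row in sample]) for j in range(d)]
    t = sorted(
        min((n + 1) / (n + 1 - col_ranks[j][i]) for j in range(d))
        for i in range(n)
    )
    threshold = t[n - k - 1]  # the (k + 1)-th largest value
    return sum(math.log(v / threshold) for v in t[n - k:]) / k


# Fully dependent pair: the estimate should be close to 1.
dependent = [(i, i) for i in range(1000)]
print(hill_eta(dependent, 100))

# Independent pair: the estimate should be close to 1/2.
rng = random.Random(0)
independent = [(rng.random(), rng.random()) for _ in range(5000)]
print(hill_eta(independent, 200))
```

An estimate near 1 suggests that the variables in the group can be large together, whereas a value clearly below 1 points to a faster decay of the joint survival function.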
On Anomaly Ranking and Excess-Mass Curves
Learning how to rank multivariate unlabeled observations depending on their
degree of abnormality/novelty is a crucial problem in a wide range of
applications. In practice, it generally consists in building a real valued
"scoring" function on the feature space so as to quantify to which extent
observations should be considered as abnormal. In the 1-d situation,
measurements are generally considered as "abnormal" when they are remote from
central measures such as the mean or the median. Anomaly detection then relies
on tail analysis of the variable of interest. Extensions to the multivariate
setting are far from straightforward and it is precisely the main purpose of
this paper to introduce a novel and convenient (functional) criterion for
measuring the performance of a scoring function regarding the anomaly ranking
task, referred to as the Excess-Mass curve (EM curve). In addition, an adaptive
algorithm for building a scoring function based on unlabeled data X_1, ..., X_n
with a nearly optimal EM is proposed and is analyzed from a statistical
perspective.
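The EM criterion can be illustrated numerically. The sketch below assumes the definition EM_s(t) = sup_u { P(s(X) >= u) - t * Leb({x : s(x) >= u}) }, replaces the probability by an empirical frequency, and approximates the Lebesgue measure by Monte Carlo sampling over a bounding box. The bounding box, the sample sizes, and the function names are illustrative assumptions rather than choices made in the paper.

```python
import math
import random
from bisect import bisect_left


def frac_at_least(sorted_scores, u):
    """Fraction of entries of an ascending list that are >= u."""
    return (len(sorted_scores) - bisect_left(sorted_scores, u)) / len(sorted_scores)


def em_curve(score, data, box, t_values, n_mc=20000, seed=0):
    """Empirical excess-mass curve of a scoring function.

    score: maps a point (list of floats) to a real number
    box:   list of (low, high) bounds used for the Monte Carlo volume estimate
    """
    rng = random.Random(seed)
    volume = math.prod(hi - lo for lo, hi in box)
    data_scores = sorted(score(x) for x in data)
    mc_scores = sorted(
        score([rng.uniform(lo, hi) for lo, hi in box]) for _ in range(n_mc)
    )
    levels = sorted(set(data_scores))  # candidate thresholds u
    mass = [frac_at_least(data_scores, u) for u in levels]
    vol = [volume * frac_at_least(mc_scores, u) for u in levels]
    # The empty level set contributes 0, hence the outer max with 0.0.
    return [max(0.0, max(m - t * v for m, v in zip(mass, vol))) for t in t_values]


# Example: standard Gaussian data scored by an unnormalized Gaussian density.
rng = random.Random(1)
data = [[rng.gauss(0.0, 1.0)] for _ in range(500)]
score = lambda x: math.exp(-0.5 * x[0] ** 2)
curve = em_curve(score, data, box=[(-5.0, 5.0)], t_values=[0.0, 0.1, 0.2, 0.4])
print(curve)
```

By construction the curve equals 1 at t = 0 and is nonincreasing in t. A scoring function whose curve lies higher concentrates more probability mass in level sets of small volume.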
Can distress tolerance predict chronic worry? Investigating the relationships among worry, distress tolerance, cognitive avoidance, psychological flexibility, difficulties in emotion regulation, and anxiety sensitivity
According to the avoidance theory of worry proposed by Borkovec, Alcaine, and Behar (2004), chronic worry functions as an avoidance mechanism, enabling an individual to diminish the physiological experience of anxiety by impeding emotional processing of the fear stimulus. Previous research has revealed significant correlations between chronic worry and difficulties in emotion regulation (Salters-Pedneault et al., 2006) as well as anxiety sensitivity (Floyd, Garfield, & LaSota, 2005). Distress tolerance, which is significantly related to anxiety sensitivity (Bernstein, Zvolensky, Vujanovic, & Moos, 2009), is strongly associated with many maladaptive avoidance behaviors (Anestis et al., 2007; Linehan, 1993; Timpano et al., 2009; Vujanovic et al., 2011). The present study examined the relationships among these variables, as the investigators hypothesized that distress tolerance would be a significant predictor of worry. Undergraduate and graduate Eastern Michigan University students (n = 470) completed several measures via an on-line survey system. Analyses of the data support the correlational relationships among anxiety sensitivity, difficulties in emotion regulation, avoidance constructs, and worry reported in previous research. Distress tolerance was also found to correlate significantly and negatively with worry. Additionally, analyses revealed distress tolerance, psychological flexibility, and cognitive avoidance to be significant predictors of worry. These novel findings add to the literature on the development and maintenance of chronic worry. The discovery of this significant relationship sheds light on avenues for clinical improvement in treating worry. Finally, the present study provides theoretical support for acceptance-based behavioral therapies (ABBTs), which have been yielding promising results for chronic worriers.
Principal Component Analysis for Multivariate Extremes
The first order behavior of multivariate heavy-tailed random vectors above large radial thresholds is ruled by a limit measure in a regular variation framework. For a high dimensional vector, a reasonable assumption is that the support of this measure is concentrated on a lower dimensional subspace, meaning that certain linear combinations of the components are much likelier to be large than others. Identifying this subspace and thus reducing the dimension will facilitate a refined statistical analysis. In this work we apply Principal Component Analysis (PCA) to a re-scaled version of radially thresholded observations. Within the statistical learning framework of empirical risk minimization, our main focus is to analyze the squared reconstruction error for the exceedances over large radial thresholds. We prove that the empirical risk converges to the true risk, uniformly over all projection subspaces. As a consequence, the best projection subspace is shown to converge in probability to the optimal one, in terms of the Hausdorff distance between their intersections with the unit sphere. In addition, if the exceedances are re-scaled to the unit ball, we obtain finite sample uniform guarantees for the reconstruction error pertaining to the estimated projection subspace. Numerical experiments illustrate the relevance of the proposed framework for practical purposes.
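The procedure described in this abstract can be sketched in a few lines of NumPy. The version below makes several assumptions that the abstract does not fix: the radial threshold is set by keeping the k observations with largest Euclidean norm, exceedances are re-scaled to the unit sphere, and the PCA is uncentered, that is, based on the second-moment matrix of the re-scaled points. The function name `extreme_pca` and its parameters are hypothetical.

```python
import numpy as np


def extreme_pca(X, k, p):
    """PCA on re-scaled exceedances over a large radial threshold.

    X: (n, d) array of observations
    k: number of largest-norm observations treated as extreme
    p: dimension of the projection subspace
    Returns an orthonormal basis (d, p) and the empirical squared
    reconstruction error of the re-scaled exceedances.
    """
    norms = np.linalg.norm(X, axis=1)
    idx = np.argsort(norms)[-k:]           # indices of the k largest norms
    theta = X[idx] / norms[idx, None]      # re-scale to the unit sphere
    second_moment = theta.T @ theta / k    # uncentered covariance (assumption)
    eigvals, eigvecs = np.linalg.eigh(second_moment)  # ascending eigenvalues
    basis = eigvecs[:, ::-1][:, :p]        # top-p eigenvectors
    projector = basis @ basis.T
    risk = np.mean(np.sum((theta - theta @ projector) ** 2, axis=1))
    return basis, risk


# Example: heavy-tailed radii along a single direction in R^3.
rng = np.random.default_rng(0)
v = np.array([1.0, 1.0, 0.0]) / np.sqrt(2.0)
X = np.outer(rng.pareto(2.0, size=500) + 1.0, v)
basis, risk = extreme_pca(X, k=50, p=1)
print(basis[:, 0], risk)
```

In this toy example all extremes lie on one line, so the estimated one-dimensional subspace coincides with that line (up to sign) and the empirical reconstruction error is numerically zero.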
Regular Variation in Hilbert Spaces and Principal Component Analysis for Functional Extremes
Motivated by the increasing availability of data of functional nature, we
develop a general probabilistic and statistical framework for extremes of
regularly varying random elements in L^2[0,1]. We place ourselves in a
Peaks-Over-Threshold framework where a functional extreme is defined as an
observation whose L^2-norm is comparatively large. Our goal is to
propose a dimension reduction framework resulting in finite dimensional
projections for such extreme observations. Our contribution is twofold. First,
we investigate the notion of Regular Variation for random quantities valued in
a general separable Hilbert space, for which we propose a novel concrete
characterization involving solely stochastic convergence of real-valued random
variables. Second, we propose a notion of functional Principal Component
Analysis (PCA) accounting for the principal 'directions' of functional
extremes. We investigate the statistical properties of the empirical covariance
operator of the angular component of extreme functions, by upper-bounding the
Hilbert-Schmidt norm of the estimation error for finite sample sizes. Numerical
experiments with simulated and real data illustrate this work.
Comment: 29 pages (main paper), 5 pages (appendix)
Uniform concentration bounds for frequencies of rare events
New Vapnik and Chervonenkis type concentration inequalities are derived for
the empirical distribution of an independent random sample. Focus is on the
maximal deviation over classes of Borel sets within a low probability region.
The constants are explicit, enabling numerical comparisons.
Comment: 11 pages, 1 figure